# Self-supervised pretraining

## voc2vec
Apache-2.0 · Audio Classification · Transformers · English · alkiskoudounas · Downloads: 223 · Likes: 2

voc2vec is a foundation model designed specifically for non-linguistic human audio. It is built on the wav2vec 2.0 framework and pretrained on approximately 125 hours of non-linguistic audio.
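A minimal usage sketch for a checkpoint like this, treating it as a clip-level feature extractor with the transformers library; the Hub id `alkiskoudounas/voc2vec` and the 16 kHz mono input are assumptions taken from the listing and the usual wav2vec 2.0 convention, not verified here.

```python
import torch
from transformers import AutoFeatureExtractor, AutoModel

# Hub id taken from the listing above; adjust if the repository name differs.
MODEL_ID = "alkiskoudounas/voc2vec"

extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

# One second of silence at 16 kHz stands in for a real vocalization clip.
waveform = torch.zeros(16000).numpy()
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    frames = model(**inputs).last_hidden_state  # (1, num_frames, hidden_size)

# Mean-pool frame embeddings into one clip-level vector for a downstream classifier.
clip_embedding = frames.mean(dim=1)
print(clip_embedding.shape)
```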
## regnety_320.seer
Other · Image Classification · Transformers · timm · Downloads: 19 · Likes: 0

RegNetY-32GF feature-extraction model pretrained on 2 billion random web images with the SEER method; suitable for image classification and feature extraction tasks.
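Since the weights are distributed through timm, a feature-extraction sketch might look like the following; the model name `regnety_320.seer` is assumed to match timm's naming and can be checked with `timm.list_models`.

```python
import timm
import torch

# Model name assumed from the listing; num_classes=0 returns pooled features instead of logits.
model = timm.create_model("regnety_320.seer", pretrained=True, num_classes=0)
model.eval()

# Build the preprocessing pipeline the pretrained weights expect.
config = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**config)

# A random tensor with the expected input size stands in for a preprocessed image batch.
dummy = torch.randn(1, *config["input_size"])
with torch.no_grad():
    features = model(dummy)

print(features.shape)  # (1, feature_dim)
```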
## vit-msn-base-4
Apache-2.0 · Image Classification · Transformers · facebook · Downloads: 62 · Likes: 1

This Vision Transformer model is pretrained with the MSN method. It excels in few-shot scenarios and is suited to tasks such as image classification.
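A sketch of pulling a [CLS] embedding from the checkpoint for a few-shot classifier, assuming the Hub id `facebook/vit-msn-base-4` and the `ViTMSNModel` class in transformers; the blank image is a placeholder for real data.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, ViTMSNModel

# Hub id assumed from the listing above.
MODEL_ID = "facebook/vit-msn-base-4"

processor = AutoImageProcessor.from_pretrained(MODEL_ID)
model = ViTMSNModel.from_pretrained(MODEL_ID)
model.eval()

# A blank RGB image stands in for a real photo.
image = Image.new("RGB", (224, 224))
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# The [CLS] embedding is a natural input to a nearest-neighbour or linear few-shot classifier.
cls_embedding = outputs.last_hidden_state[:, 0]
print(cls_embedding.shape)
```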
## regnet-y-1280-seer-in1k
Apache-2.0 · Image Classification · Transformers · facebook · Downloads: 18 · Likes: 1

RegNet image classification model trained with self-supervised (SEER) pretraining and then fine-tuned on ImageNet-1k.
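A hedged classification sketch, assuming the Hub id `facebook/regnet-y-1280-seer-in1k` and the generic transformers image-classification API; the placeholder image should be replaced with real data.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, RegNetForImageClassification

# Hub id assumed from the listing above; this is a very large checkpoint.
MODEL_ID = "facebook/regnet-y-1280-seer-in1k"

processor = AutoImageProcessor.from_pretrained(MODEL_ID)
model = RegNetForImageClassification.from_pretrained(MODEL_ID)
model.eval()

image = Image.new("RGB", (224, 224))  # placeholder for a real photo
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# The head was fine-tuned on ImageNet-1k, so labels are ImageNet classes.
predicted = logits.argmax(-1).item()
print(model.config.id2label[predicted])
```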
## xlm-roberta-xxl
MIT · Large Language Model · Transformers · Multilingual · facebook · Downloads: 13.19k · Likes: 15

XLM-RoBERTa-XXL is a multilingual model pretrained on 2.5 TB of filtered CommonCrawl data covering 100 languages; it is a scaled-up (XXL) variant of the RoBERTa architecture.
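Because the checkpoint is a masked-language model, a fill-mask pipeline is the quickest way to probe it; the sketch below assumes the Hub id `facebook/xlm-roberta-xxl` and enough memory for a model of this size.

```python
from transformers import pipeline

# The checkpoint has roughly 10B parameters, so expect a long download and high memory use.
fill_mask = pipeline("fill-mask", model="facebook/xlm-roberta-xxl")

# XLM-RoBERTa uses <mask> as its mask token; prompts in any of the covered languages work the same way.
for candidate in fill_mask("Paris is the <mask> of France."):
    print(candidate["token_str"], round(candidate["score"], 3))
```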
## CORe Clinical Mortality Prediction
Text Classification · Transformers · English · DATEXIS · Downloads: 924 · Likes: 3

The CORe model is based on the BioBERT architecture and was pretrained on clinical records, disease descriptions, and medical literature; this checkpoint is used to predict in-hospital mortality risk.
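A sketch of scoring a clinical note with the checkpoint, assuming the Hub id `DATEXIS/CORe-clinical-mortality-prediction` (casing unverified) and a standard sequence-classification head; the toy note and the labels printed from `id2label` are illustrative only.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hub id assumed from the listing above; check the exact repository name on the Hub.
MODEL_ID = "DATEXIS/CORe-clinical-mortality-prediction"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# A toy admission note stands in for a real clinical record.
note = (
    "CHIEF COMPLAINT: Chest pain. "
    "HISTORY OF PRESENT ILLNESS: 67-year-old male admitted with acute chest pain and dyspnea."
)
inputs = tokenizer(note, truncation=True, return_tensors="pt")

with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)

# Report the model's label set with the predicted probabilities.
for label_id, label in model.config.id2label.items():
    print(label, round(probs[0, label_id].item(), 3))
```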
## beit-large-patch16-224
Apache-2.0 · Image Classification · microsoft · Downloads: 222.46k · Likes: 1

BEiT is an image classification model based on the Vision Transformer (ViT) architecture, pretrained with self-supervised learning on ImageNet-21k and fine-tuned on ImageNet-1k.
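A short classification sketch, assuming the Hub id `microsoft/beit-large-patch16-224` and the BEiT classes in transformers; the blank image is a stand-in for a real photo.

```python
import torch
from PIL import Image
from transformers import BeitForImageClassification, BeitImageProcessor

# Hub id assumed from the listing above.
MODEL_ID = "microsoft/beit-large-patch16-224"

processor = BeitImageProcessor.from_pretrained(MODEL_ID)
model = BeitForImageClassification.from_pretrained(MODEL_ID)
model.eval()

image = Image.new("RGB", (224, 224))  # placeholder for a real photo
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Print the top-5 ImageNet-1k labels for the input image.
top5 = logits.topk(5).indices[0].tolist()
print([model.config.id2label[i] for i in top5])
```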